Skip to content

fix(workflow-executor): drop dead-branch steps from context (PRD-433)#1623

Open
hercemer42 wants to merge 9 commits into
feat/prd-214-server-step-mapperfrom
fix/prd-433-revise-step-run-context
Open

fix(workflow-executor): drop dead-branch steps from context (PRD-433)#1623
hercemer42 wants to merge 9 commits into
feat/prd-214-server-step-mapperfrom
fix/prd-433-revise-step-run-context

Conversation

@hercemer42
Copy link
Copy Markdown

@hercemer42 hercemer42 commented Jun 3, 2026

fixes PRD-433

Problem

Revising an earlier workflow step did not discard the steps that originally ran after it. The orchestrator marks them correctly (revised on the pivot card, cancelled on the downstream branch, still-valid clones re-appended with originalStepIndex), but the executor read the run history branch-blind in two places:

  1. toPreviousSteps filtered only on done && stepIndex < current → dead-branch entries polluted the AI context for every step type.
  2. The record pool (getAvailableRecordRefs) was built from the raw local RunStore → records loaded by superseded steps were still offered as sources (revising "Load store" proposed loading from the dead branch's owner instead of reloading the store), while live clones (new stepIndex) couldn't be matched to their stored records.

Fix (executor-only)

  • toPreviousSteps now mirrors the orchestrator's own live-path filter — !revised && !cancelled — so dead-branch steps never reach the AI context or the record pool.
  • originalStepIndex (already on the wire in the orchestrator's history) is declared in ServerStepHistory and forwarded onto each mapped Step. It is set on a revision clone — a still-valid step the orchestrator re-injects — and points at the step the clone copies, where that step's record lives.
  • BaseStepExecutor.resolveStepExecution resolves a step's stored execution own-index-first: a step the executor ran has its execution at its own stepIndex; a clone (which never ran) falls back to originalStepIndex. Own-index-first is what stops a re-executed step — which has its own entry — from resurfacing the superseded original's record. Both the record pool and the previous-steps AI summary resolve through it.

This mirrors the frontend's carryForwardExecutorDataForCopiedSteps (which copies executor data from originalStepIndex to a clone's new index): the executor just resolves the record on read instead of copying it.

Why not stepName: LinkTo loops are a product feature — the same step name can legitimately appear twice on the live path — so the step index is the only safe key.

Tests

Full package suite green (925/925); the changed behavior is covered:

  • cancelled downstream steps excluded from previousSteps and the record pool
  • a revision clone resolves to its copied step's record via originalStepIndex, while a re-executed step uses its own execution — never the superseded original's (the reported bug: a revised load that yields no record shows no stale record downstream)
  • entry-point revision collapses the pool to the base record
  • double revision resolves the freshest generation; the stale one is absent from the select-record enum
  • LinkTo loop: two live instances of the same stepName keep their own records

🤖 Generated with Claude Code

Note

Drop dead-branch steps from workflow executor context on revision

  • toPreviousSteps in run-to-available-step-mapper.ts now filters out revised and cancelled history entries so only live-path steps are surfaced to executors.
  • A new toLineageStepIndexes utility computes each step's revision lineage (freshest-first), stored as lineageStepIndexes on the Step type.
  • BaseStepExecutor.resolveLineageExecution in base-step-executor.ts uses lineageStepIndexes to find the most recent valid execution for a step when building previous-step summaries.
  • RecordStepExecutor.getAvailableRecordRefs in record-step-executor.ts now sources related records from live previousSteps only, resolved via lineage, instead of scanning all RunStore executions.
  • Behavioral Change: after a workflow revision, dead-branch step outputs (records, summaries) are no longer visible to downstream steps; the record pool collapses to the base record plus live load-related steps only.

Changes since #1623 opened

  • Replaced resolveLineageExecution with resolveStepExecution in BaseStepExecutor.buildPreviousStepsMessages and BaseStepExecutor.resolveStepExecution static utility, and in RecordStepExecutor.getAvailableRecordRefs, changing execution resolution from iterating multiple lineage generation candidates to finding by the step's own stepIndex first, then falling back to originalStepIndex when present, ensuring re-executed steps never resurface superseded originals while clones inherit execution data from their copied step [03e2b08]
  • Removed lineageStepIndexes property and added originalStepIndex property to validated.execution.StepSchema in the workflow-executor package, where originalStepIndex is an optional number that points to the step a clone copies, with updated documentation describing clone inheritance semantics and warnings against keying on stepName [03e2b08]
  • Changed tryMapStep signature in run-to-available-step-mapper to remove lineageStepIndexes parameter and conditionally include originalStepIndex in returned Step objects when present on server step history, removed toLineageStepIndexes function that computed lineage arrays, and updated toPreviousSteps to call the modified tryMapStep without passing lineage data [03e2b08]
  • Updated test expectations across run-to-available-step-mapper.test.ts, base-step-executor.test.ts, load-related-record-step-executor.test.ts, read-record-step-executor.test.ts, trigger-record-action-step-executor.test.ts, and update-record-step-executor.test.ts to assert presence or absence of originalStepIndex instead of lineageStepIndexes, verify own-index-first resolution behavior, confirm clones inherit from originals, and ensure re-executed steps prefer their own execution [03e2b08]
  • Updated inline comment in RecordStepExecutor.getAvailableRecordRefs method describing load-related step record resolution logic [e17962e]
  • Clarified documentation for originalStepIndex field across ServerStepHistory interface, BaseStepExecutor.resolveStepExecution method, and StepSchema constant [76f58fb]
  • Converted BaseStepExecutor.resolveStepExecution from a static method to an instance method in the workflow-executor package [f2d6780]
  • Modified BaseStepExecutor.resolveStepExecution to prevent re-executed clone steps with error status from inheriting execution data from their original step [7979556]
  • Added test coverage in read-record-step-executor.test verifying that errored clone steps do not inherit execution data from their original step [7979556]
  • Reworded inline comment in BaseStepExecutor.resolveStepExecution method [7273844]

Macroscope summarized 584f656.

@linear-code
Copy link
Copy Markdown

linear-code Bot commented Jun 3, 2026

PRD-433

@qltysh
Copy link
Copy Markdown

qltysh Bot commented Jun 3, 2026

1 new issue

Tool Category Rule Count
qlty Structure Function with many returns (count = 4): getAvailableRecordRefs 1

@hercemer42 hercemer42 force-pushed the fix/prd-433-revise-step-run-context branch from 4c3cde3 to b608f35 Compare June 3, 2026 13:36
@hercemer42
Copy link
Copy Markdown
Author

[Claude Opus 4.8] — Review resolution on #1623

1. Inline conventions comment (test comment referencing the ticket) — agreed and fixed in b608f35.

2. qlty: "Function with many returns (count = 4): getAvailableRecordRefs" — pushing back, not fixing. The four returns are two guard-style early exits inside a flatMap callback plus the final pool return; the project's own code review principles explicitly prescribe this shape ("Return early — guard clauses at the top, avoid deep nesting"). Collapsing them into a single-exit ternary would also fight TypeScript's type narrowing on the discriminated executionResult union. 20-line function, single responsibility — restructuring would reduce readability to satisfy a counter, with no behavioral benefit. Flagging the conflict between the qlty many-returns rule and the project's return-early convention for a maintainer decision on the rule threshold.

3. CI failure (Linting & Testing workflow-executor) — root cause was not the review findings: the base branch moved under the PR (#1610 reworked the load-related executor; BelongsTo loads now go through getSingleRelatedData instead of getRelatedData). CI tests the merge result, so the revision repro test asserted a port call the merged flow no longer makes. Resolved by rebasing onto the current base and adapting the assertion to the sibling tests' shape (b608f35). Full suite on the rebased branch: 898/898 green.

@qltysh
Copy link
Copy Markdown

qltysh Bot commented Jun 3, 2026

Qlty


Coverage Impact

Unable to calculate total coverage change because base branch coverage was not found.

Modified Files with Diff Coverage (3)

RatingFile% DiffUncovered Line #s
New Coverage rating: A
packages/workflow-executor/src/executors/record-step-executor.ts100.0%
New Coverage rating: A
packages/workflow-executor/src/executors/base-step-executor.ts85.7%312
New Coverage rating: A
...workflow-executor/src/adapters/run-to-available-step-mapper.ts100.0%
Total92.3%
🤖 Increase coverage with AI coding...
In the `fix/prd-433-revise-step-run-context` branch, add test coverage for this new code:

- `packages/workflow-executor/src/executors/base-step-executor.ts` -- Line 312

🚦 See full report on Qlty Cloud »

🛟 Help
  • Diff Coverage: Coverage for added or modified lines of code (excludes deleted files). Learn more.

  • Total Coverage: Coverage for the whole repository, calculated as the sum of all File Coverage. Learn more.

  • File Coverage: Covered Lines divided by Covered Lines plus Missed Lines. (Excludes non-executable lines including blank lines and comments.)

    • Indirect Changes: Changes to File Coverage for files that were not modified in this PR. Learn more.

@hercemer42 hercemer42 force-pushed the fix/prd-433-revise-step-run-context branch 3 times, most recently from fa5f268 to 459bd01 Compare June 4, 2026 11:35
@hercemer42 hercemer42 marked this pull request as ready for review June 4, 2026 11:46
hercemer42 and others added 3 commits June 4, 2026 14:33
Revising a step forks the run history: the orchestrator stamps the pivot
card revised and the downstream branch cancelled, then re-appends live
clones. The executor read that history branch-blind, so re-executed
steps inherited stale AI context and the record pool offered records
loaded by superseded steps (e.g. revising "Load store" proposed loading
from the dead branch's owner).

previousSteps now mirrors the orchestrator's own live-path filter, and
each step carries lineageStepIndexes so executions persisted in the
RunStore under a clone's original index resolve to the freshest
generation — never to a dead branch, never by stepName (LinkTo loops
make names non-unique on the live path).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The base branch's load-related rework routes BelongsTo loads through
getSingleRelatedData instead of getRelatedData; the revision repro test
asserted the old port call. Also rewords a test comment that referenced
the ticket (review feedback).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ationale

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Scra3 Scra3 force-pushed the fix/prd-433-revise-step-run-context branch from 459bd01 to 584f656 Compare June 4, 2026 12:35
Comment thread packages/workflow-executor/src/adapters/server-types.ts Outdated
Comment thread packages/workflow-executor/src/executors/record-step-executor.ts
Comment on lines +28 to +31
// RunStore execution lookup candidates, own stepIndex first then earlier same-step
// generations descending. Revision clones run under a new stepIndex while their execution
// data stays keyed under the original one. Absent (legacy producers) means "own index only".
lineageStepIndexes: z.array(z.number().int().nonnegative()).nonempty().optional(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to remove: we shouldn't pollute executor data with something used only for computing the real data

@hercemer42
Copy link
Copy Markdown
Author

hercemer42 commented Jun 4, 2026

Comment from Alban :

Found bug :

  1. load a related dvd
  2. display dvd name
  3. revise step 1, load fails, next
  4. display dvd name => it displays the dvd loaded in step 1, but it was revised.

@hercemer42
Copy link
Copy Markdown
Author

Comments from Enki :

I think we should actually copy the data for copied steps. It would make the code simpler for a negligible memory cost - the data you fetch with the resolveLineageExecution method

we should fetch & copy the data at the adapter level, so that executors don’t care about steps that are copied or not

hercemer42 and others added 3 commits June 4, 2026 17:19
…front

Replaces the lineage-walk with the frontend's carry-forward model: a step's
record comes from its own RunStore entry, falling back to originalStepIndex
only for a revision clone the executor never ran. Own-index-first stops a
re-executed step that produced no record (failed / skipped / handled
manually) from resurfacing its superseded original's record. Drops
lineageStepIndexes from the Step type and the lineage grouping from the
mapper.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…n-index-first

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ined

Drop frontend cross-references and the contested orchestrator-internal claim
about re-executions carrying originalStepIndex; describe only what the
executor uses the field for.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread packages/workflow-executor/src/executors/base-step-executor.ts Outdated
hercemer42 and others added 3 commits June 5, 2026 15:52
…thod

Per PR review: resolveStepExecution was static. Since RecordStepExecutor
extends BaseStepExecutor, make it a protected instance method and call it
via `this.`. Pure helper, no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A re-executed revision clone that errors persists no execution of its own
(load-related only saves on success), so own-index resolution finds nothing
and the originalStepIndex fallback resurfaced the superseded original's
record. Gate the fallback on a non-error outcome.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants